Staged Training Report ✓ Complete

Run ID: test_plateau_fix
Generated: 2026-02-11 12:17:45
Stages Completed: 1
Total Elapsed Time: 00:33:15

Configuration

Config Defaults Changed Since Last Commit

Parameter Previous Current
batch_size 1 4
divergence_patience 10 20
divergence_ratio 1.3 1.5
plateau_patience 20 15
plateau_sweep.plateau_patience 100 25
runs_per_stage 1 2
serial_runs None True
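
A table like the one above can be derived by comparing two dictionaries of defaults. The following is a minimal sketch, not the report generator's actual code: `diff_defaults`, `previous_defaults`, and `current_defaults` are hypothetical names, filled with the values listed above. A key that is absent from one side is treated as `None`, which is one possible reading of the `serial_runs` row.

```python
def diff_defaults(previous: dict, current: dict) -> dict:
    """Return {name: (previous_value, current_value)} for every changed parameter."""
    changed = {}
    for name in sorted(set(previous) | set(current)):
        old, new = previous.get(name), current.get(name)  # missing key -> None
        if old != new:
            changed[name] = (old, new)
    return changed


# Values copied from the table above (hypothetical containers).
previous_defaults = {
    "batch_size": 1,
    "divergence_patience": 10,
    "divergence_ratio": 1.3,
    "plateau_patience": 20,
    "plateau_sweep.plateau_patience": 100,
    "runs_per_stage": 1,
}
current_defaults = {
    "batch_size": 4,
    "divergence_patience": 20,
    "divergence_ratio": 1.5,
    "plateau_patience": 15,
    "plateau_sweep.plateau_patience": 25,
    "runs_per_stage": 2,
    "serial_runs": True,
}

changes = diff_defaults(previous_defaults, current_defaults)
for name, (old, new) in changes.items():
    print(f"{name} {old} {new}")
```
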
All Configuration Parameters (59 parameters)
Parameter Value
total_samples 10000000
batch_size 4
stage_samples_multiplier 100000000000
update_interval 250
window_size 100
num_best_models_to_keep 1
sampling_mode Loss-weighted
loss_weight_temperature 0.5
loss_weight_refresh_interval 50
stop_on_divergence True
divergence_gap 0.002
divergence_ratio 1.5
divergence_patience 20
divergence_min_updates 5
val_spike_threshold 2.0
val_spike_window 15
val_spike_frequency 0.75
val_plateau_patience 100
val_plateau_min_delta 0.0001
custom_lr 0.0001
disable_lr_scaling True
custom_warmup -1
lr_min_ratio 0.001
resume_warmup_ratio 0.05
plateau_factor 0.8
plateau_patience 15
preserve_optimizer False
preserve_scheduler True
samples_mode Train additional samples
num_random_obs_to_visualize 2
selected_frame_offset 3
runs_per_stage 2
serial_runs True
clean_old_checkpoints True
enable_baseline False
baseline_runs_per_stage 1
run_id test_plateau_fix
enable_wandb True
wandb_project developmental-robot-movement
lr_sweep.lr_min 1e-07
lr_sweep.lr_max 0.01
lr_sweep.phase_a_num_candidates 5
lr_sweep.phase_a_seeds 1
lr_sweep.phase_a_time_budget_min 3.0
lr_sweep.phase_a_survivor_count 2
lr_sweep.phase_b_seeds 3
lr_sweep.phase_b_time_budget_min 10.0
lr_sweep.ranking_metric median_best_val
lr_sweep.min_samples_before_timeout 1000
lr_sweep.min_evals_before_stop 5
lr_sweep.save_sweep_state True
plateau_sweep.enabled True
plateau_sweep.plateau_ema_alpha 0.9
plateau_sweep.plateau_improvement_threshold 0.0005
plateau_sweep.plateau_patience 25
plateau_sweep.cooldown_updates 5
plateau_sweep.max_sweeps_per_stage 2
plateau_sweep.min_sweep_improvement 0.0
stage_time_budget_min 30.0

Timing Summary

Stage     Plateau Sweeps  Sweep Time  Training Time  Stage Total
Stage 10  1               00:13:47    00:10:16       00:24:04
TOTAL     1               00:13:47    00:10:16       00:24:04
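
As a quick consistency check on the timing row, the sweep and training durations can be summed and compared with the stage total. The components add up to 00:24:03, one second short of the reported 00:24:04; a likely explanation, offered here as an assumption, is that each duration is truncated or rounded to whole seconds independently. The helper `to_seconds` is a hypothetical name.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Durations copied from the Stage 10 row above.
sweep = to_seconds("00:13:47")
training = to_seconds("00:10:16")
stage_total = to_seconds("00:24:04")

component_sum = sweep + training
sweep_share = sweep / stage_total  # fraction of the stage spent in the sweep

print(f"sum of components: {component_sum} s, reported total: {stage_total} s")
print(f"sweep share of stage time: {sweep_share:.1%}")
```

By this arithmetic the plateau sweep accounts for a little over half of the stage's wall time.
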

Plateau Sweep Details

Total Sweeps: 1
Stages with Sweeps: 1 of 1
Total Sweep Time: 00:13:47
Average Sweep Duration: 00:13:47

Stage 10: 1 sweep

LR Progression: 1.0e-04 → 3.2e-05

Sweep #  Triggered At (samples)  Wall Time  Selected LR  Duration
1        54,684                  00:08:15   3.16e-05     00:13:47
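
The selected LR of 3.16e-05 equals 10^-4.5, the geometric midpoint of `lr_sweep.lr_min` (1e-07) and `lr_sweep.lr_max` (0.01). That is consistent with, though it does not prove, a sweep that evaluates `lr_sweep.phase_a_num_candidates` = 5 log-spaced candidates. The sketch below generates such a grid under that assumption; `log_spaced_candidates` is a hypothetical helper, not a function from the training code.

```python
import math


def log_spaced_candidates(lr_min: float, lr_max: float, count: int) -> list[float]:
    """Return `count` learning rates evenly spaced in log10 between lr_min and lr_max."""
    if count < 2:
        raise ValueError("count must be at least 2")
    lo, hi = math.log10(lr_min), math.log10(lr_max)
    step = (hi - lo) / (count - 1)
    return [10 ** (lo + i * step) for i in range(count)]


# Bounds and candidate count taken from the lr_sweep.* configuration values.
candidates = log_spaced_candidates(1e-07, 0.01, 5)
print([f"{lr:.2e}" for lr in candidates])
```

The middle candidate of this grid formats as 3.16e-05, matching the Selected LR column.
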

Stage Results

Stage     Best Loss  Stop Reason  Samples Trained  Time      Sweeps  LR (Initial→Final)
Stage 10  0.013036   divergence   13,104           00:24:04  1       1.0e-04→3.2e-05

Total Plateau Sweeps: 1

[Figure: Stop Reason Breakdown]

[Figure: Loss Across Full Training Run]

[Figure: Loss Detail (Post Initial Drop)]

Multi-Run Statistics

Total Runs: 2
Average Best Loss: 0.016618 ± 0.003582
Best Overall: 0.013036
Worst Overall: 0.020200

Stage 10 (2 runs)

Run  Best Loss  Stop Reason  Samples  Time      Selected
1    0.020200   divergence   56,700   00:08:39  No
2    0.013036   divergence   13,104   00:24:04  Yes
Mean: 0.016618 ± 0.003582 Min: 0.013036 / Max: 0.020200 Range: 0.007164
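
The summary line can be reproduced from the two best-loss values. The reported ± 0.003582 matches the population standard deviation (dividing by N), not the sample standard deviation (dividing by N - 1), which would be noticeably larger for only two runs. A short check using the standard library:

```python
import statistics

# Best loss of each run, copied from the table above.
best_losses = [0.020200, 0.013036]

mean = statistics.mean(best_losses)
population_std = statistics.pstdev(best_losses)  # divides by N
sample_std = statistics.stdev(best_losses)       # divides by N - 1
value_range = max(best_losses) - min(best_losses)

print(f"Mean: {mean:.6f} ± {population_std:.6f}")
print(f"Sample std (N - 1): {sample_std:.6f}")
print(f"Range: {value_range:.6f}")
```

With two runs the spread estimate is coarse either way, so the ± figure is best read as half the gap between the two runs.
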

Best Checkpoint

Name: best_model_auto_session_so101_should_pan_500_stage10_train_304_test_plateau_fix_00062748_cont_val_0.013036.pth
Stage: 10
Hybrid Loss (full session): 0.018743
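
The checkpoint filename carries the stage number and the best validation loss, both of which agree with the values reported here (stage 10, 0.013036). The sketch below extracts only those two fields. The patterns are inferred from this single filename, so they are an assumption about the naming scheme, and the other tokens (`train_304`, `00062748`, `cont`) are deliberately left uninterpreted. `parse_checkpoint_name` is a hypothetical helper.

```python
import re

CHECKPOINT_NAME = (
    "best_model_auto_session_so101_should_pan_500_stage10_train_304_"
    "test_plateau_fix_00062748_cont_val_0.013036.pth"
)


def parse_checkpoint_name(name: str) -> dict:
    """Extract the stage number and validation loss from a checkpoint filename."""
    stage_match = re.search(r"_stage(\d+)_", name)
    val_match = re.search(r"_val_(\d+\.\d+)\.pth$", name)
    if stage_match is None or val_match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    return {
        "stage": int(stage_match.group(1)),
        "val_loss": float(val_match.group(1)),
    }


info = parse_checkpoint_name(CHECKPOINT_NAME)
print(info)
```
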

[Figure: Learning Rate Timeline with Plateau Sweeps]

Stage Progression

Stage  Orig Loss  Train Loss  Time      Samples  Stop Reason
10 ⭐  0.018743   0.013036    00:24:04  13,104   divergence

Hybrid Loss Over Original Session (per Stage)

Stage 10 (Best) - Hybrid Loss: 0.018743

Sample Counts

Cumulative Across All Stages

Per Stage

Stage 10 (Best) - Total Samples: 13,104

Best Checkpoint Inference

Selected Frame 3
[Figures: Action 0, Action 1, Action 2]

Random Observations

Observation 198
[Figures: Action 0, Action 1, Action 2]

Observation 249
[Figures: Action 0, Action 1, Action 2]